This is a notebook for I'm Something of a Painter Myself. The task of this project is to convert landscape photos into Monet-style images. It is an unpaired image-to-image translation task, which calls for CycleGAN training.
Target
The target of this project is to achieve FID < 100 as the Kaggle score. In addition, I would like to visually check how my CycleGAN works, so the fake, cycled, and identity images are also inspected along the way.
import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt
import seaborn as sns
import os
import time
import gc
import tensorflow as tf
from keras import backend as K
from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
import tensorflow_addons as tfa
There are 300 Monet images and 7038 landscape photos, each of size 256x256x3.
Data source
The original data is available in kaggle competition website.
https://www.kaggle.com/competitions/gan-getting-started/data
dir_monet = "/kaggle/input/gan-getting-started/monet_jpg/"
dir_photo = "/kaggle/input/gan-getting-started/photo_jpg/"
files_monet = os.listdir(dir_monet)
files_photo = os.listdir(dir_photo)
n1 = len(files_monet)
n2 = len(files_photo)
print("Number of Monet images = ", n1)
print("Number of Landscape images = ", n2)
Number of Monet images =  300
Number of Landscape images =  7038
convert_img converts a normalized image back to the original 0-255 scale. plot_photo visualizes 36 images in a 6x6 grid, and plot_OneLine plots 6 images in a single row.
# Function to convert normalized image to original scale
def convert_img(image):
img2 = ((np.array(image) + 1)*127.5).astype(int)
img2[img2 > 255] = 255
img2[img2 < 0] = 0
img2 = img2.astype(np.uint8)
return img2
def plot_photo(image):
img2 = ((np.array(image) + 1)*127.5).astype(int)
img2[img2 > 255] = 255
img2[img2 < 0] = 0
fig, ax = plt.subplots(6,6, figsize = (15,15))
N = 36
for k in range(N):
i = int(k/6)
j = k % 6
ax[i,j].imshow(img2[k])
ax[i,j].tick_params(left = False, right = False , labelleft = False , labelbottom = False, bottom = False)
def plot_OneLine(image, title):
img2 = ((np.array(image) + 1)*127.5).astype(int)
img2[img2 > 255] = 255
img2[img2 < 0] = 0
fig, ax = plt.subplots(1,6, figsize = (15,2.5))
N = 6
for k in range(N):
ax[k].imshow(img2[k])
ax[k].tick_params(left = False, right = False , labelleft = False , labelbottom = False, bottom = False)
ax[0].set_title(title)
I usually use OpenCV for reading images. Since OpenCV reads images in BGR order, they have to be converted to RGB; hence cv2.imread is followed by [:,:,::-1]. Only 100 landscape images are read at the beginning to reduce RAM usage; they are used only for validation during CycleGAN training.
n2 = 100
M_data = np.zeros((n1,256,256,3)).astype(np.uint8)
F_data = np.zeros((n2,256,256,3)).astype(np.uint8)
for i in range(n1):
M_data[i] = cv2.imread(dir_monet + files_monet[i])[:,:,::-1]
for i in range(n2):
F_data[i] = cv2.imread(dir_photo + files_photo[i])[:,:,::-1]
The images are normalized to [-1, 1] by the following code.
M_data = M_data/(255/2) - 1
F_data = F_data/(255/2) - 1
real_A = M_data[0:36]
real_B = F_data[0:36]
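As a quick sanity check (a small self-contained sketch using a hypothetical random batch, not part of the training pipeline), convert_img should invert this normalization up to integer truncation:

```python
import numpy as np

# Hypothetical random batch standing in for the real data (uint8, 0..255).
imgs = np.random.randint(0, 256, size=(4, 64, 64, 3)).astype(np.uint8)

# Same normalization as above: [0, 255] -> [-1, 1].
norm = imgs / (255 / 2) - 1

# convert_img from earlier, inlined so this cell is self-contained.
def convert_img(image):
    img2 = ((np.array(image) + 1) * 127.5).astype(int)
    img2 = np.clip(img2, 0, 255)
    return img2.astype(np.uint8)

restored = convert_img(norm)
# Round trip is exact up to float truncation: off by at most 1 per pixel.
print(np.abs(restored.astype(int) - imgs.astype(int)).max())
```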
Monet images
plot_photo(M_data)
Landscape images
plot_photo(F_data)
gc.collect()
90685
In CycleGAN, the generators and discriminators share some common components, so it is convenient to define those components as functions.
FeatureMapBlock is a feature extractor that does not change the image size; it is also used at the end of the generator to improve the quality of generated images. ContractingBlock consists of Conv2D, InstanceNormalization, and an activation function, and halves the image size just like a typical CNN. ResidualBlock has two Conv2D + InstanceNormalization layers; the original input is skipped to the end of the block, which mitigates the dead-neuron problem. ExpandingBlock consists of Conv2DTranspose, which doubles the image size, followed by InstanceNormalization and an activation function. For the CycleGAN model, InstanceNormalization is selected instead of BatchNormalization because the batch size is 1 in CycleGAN training.
def FeatureMapBlock(channel, X, final = False):
if final:
X = layers.Conv2D(channel, kernel_size = 7, padding = "same", activation = "tanh")(X)
else:
X = layers.Conv2D(channel, kernel_size = 7, padding = "same")(X)
return X
def ContractingBlock(channel, X, relu = True, ksize = 3, use_bn = True):
X = layers.Conv2D(channel, kernel_size = ksize, strides = 2, padding = "same")(X)
if use_bn:
X = tfa.layers.InstanceNormalization()(X)
if relu:
X = layers.ReLU()(X)
else:
X = layers.LeakyReLU(alpha=0.2)(X)
return X
def ResidualBlock(channel, X, relu = True):
X_original = X # skip connection
X = layers.Conv2D(channel, kernel_size = 3, padding = "same")(X)
X = tfa.layers.InstanceNormalization()(X)
if relu:
X = layers.ReLU()(X)
else:
X = layers.LeakyReLU(alpha=0.2)(X)
X = layers.Conv2D(channel, kernel_size = 3, padding = "same")(X)
X = tfa.layers.InstanceNormalization()(X)
return X + X_original
def ExpandingBlock(channel, X, ksize):
X = layers.Conv2DTranspose(channel, kernel_size=ksize, strides=2, padding="SAME")(X)
X = tfa.layers.InstanceNormalization()(X)
X = layers.LeakyReLU(alpha=0.2)(X)
return X
The generator consists of the following elements: an initial FeatureMapBlock, two ContractingBlocks, three ResidualBlocks, two ExpandingBlocks, and a final FeatureMapBlock with tanh activation. This structure is common to both generators AB and BA.
def Generator():
Input = layers.Input(shape=(256, 256, 3))
channel = 64
X = FeatureMapBlock(channel, Input, final = False) #
X = ContractingBlock(channel*2, X, False, 3)#128
X = ContractingBlock(channel*4, X, False, 3)#64
#X = ContractingBlock(channel*4, X, True, 3)#32
X = ResidualBlock(channel*4, X, False)
X = ResidualBlock(channel*4, X, False)
X = ResidualBlock(channel*4, X, False)
#X = ExpandingBlock(channel*4, X, 5)#64
X = ExpandingBlock(channel*2, X, 5)#128
X = ExpandingBlock(channel, X, 5)#256
X = FeatureMapBlock(3, X, final = True)
model = Model(inputs = Input, outputs = X)
return model
gen_AB = Generator()
gen_BA = Generator()
gen_AB.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 256, 256, 3 0 []
)]
conv2d (Conv2D) (None, 256, 256, 64 9472 ['input_1[0][0]']
)
conv2d_1 (Conv2D) (None, 128, 128, 12 73856 ['conv2d[0][0]']
8)
instance_normalization (Instan (None, 128, 128, 12 256 ['conv2d_1[0][0]']
ceNormalization) 8)
leaky_re_lu (LeakyReLU) (None, 128, 128, 12 0 ['instance_normalization[0][0]']
8)
conv2d_2 (Conv2D) (None, 64, 64, 256) 295168 ['leaky_re_lu[0][0]']
instance_normalization_1 (Inst (None, 64, 64, 256) 512 ['conv2d_2[0][0]']
anceNormalization)
leaky_re_lu_1 (LeakyReLU) (None, 64, 64, 256) 0 ['instance_normalization_1[0][0]'
]
conv2d_3 (Conv2D) (None, 64, 64, 256) 590080 ['leaky_re_lu_1[0][0]']
instance_normalization_2 (Inst (None, 64, 64, 256) 512 ['conv2d_3[0][0]']
anceNormalization)
leaky_re_lu_2 (LeakyReLU) (None, 64, 64, 256) 0 ['instance_normalization_2[0][0]'
]
conv2d_4 (Conv2D) (None, 64, 64, 256) 590080 ['leaky_re_lu_2[0][0]']
instance_normalization_3 (Inst (None, 64, 64, 256) 512 ['conv2d_4[0][0]']
anceNormalization)
tf.__operators__.add (TFOpLamb (None, 64, 64, 256) 0 ['instance_normalization_3[0][0]'
da) , 'leaky_re_lu_1[0][0]']
conv2d_5 (Conv2D) (None, 64, 64, 256) 590080 ['tf.__operators__.add[0][0]']
instance_normalization_4 (Inst (None, 64, 64, 256) 512 ['conv2d_5[0][0]']
anceNormalization)
leaky_re_lu_3 (LeakyReLU) (None, 64, 64, 256) 0 ['instance_normalization_4[0][0]'
]
conv2d_6 (Conv2D) (None, 64, 64, 256) 590080 ['leaky_re_lu_3[0][0]']
instance_normalization_5 (Inst (None, 64, 64, 256) 512 ['conv2d_6[0][0]']
anceNormalization)
tf.__operators__.add_1 (TFOpLa (None, 64, 64, 256) 0 ['instance_normalization_5[0][0]'
mbda) , 'tf.__operators__.add[0][0]']
conv2d_7 (Conv2D) (None, 64, 64, 256) 590080 ['tf.__operators__.add_1[0][0]']
instance_normalization_6 (Inst (None, 64, 64, 256) 512 ['conv2d_7[0][0]']
anceNormalization)
leaky_re_lu_4 (LeakyReLU) (None, 64, 64, 256) 0 ['instance_normalization_6[0][0]'
]
conv2d_8 (Conv2D) (None, 64, 64, 256) 590080 ['leaky_re_lu_4[0][0]']
instance_normalization_7 (Inst (None, 64, 64, 256) 512 ['conv2d_8[0][0]']
anceNormalization)
tf.__operators__.add_2 (TFOpLa (None, 64, 64, 256) 0 ['instance_normalization_7[0][0]'
mbda) , 'tf.__operators__.add_1[0][0]']
conv2d_transpose (Conv2DTransp (None, 128, 128, 12 819328 ['tf.__operators__.add_2[0][0]']
ose) 8)
instance_normalization_8 (Inst (None, 128, 128, 12 256 ['conv2d_transpose[0][0]']
anceNormalization) 8)
leaky_re_lu_5 (LeakyReLU) (None, 128, 128, 12 0 ['instance_normalization_8[0][0]'
8) ]
conv2d_transpose_1 (Conv2DTran (None, 256, 256, 64 204864 ['leaky_re_lu_5[0][0]']
spose) )
instance_normalization_9 (Inst (None, 256, 256, 64 128 ['conv2d_transpose_1[0][0]']
anceNormalization) )
leaky_re_lu_6 (LeakyReLU) (None, 256, 256, 64 0 ['instance_normalization_9[0][0]'
) ]
conv2d_9 (Conv2D) (None, 256, 256, 3) 9411 ['leaky_re_lu_6[0][0]']
==================================================================================================
Total params: 4,956,803
Trainable params: 4,956,803
Non-trainable params: 0
__________________________________________________________________________________________________
tf.keras.utils.plot_model(gen_AB)
The discriminator consists of the following elements: an initial FeatureMapBlock, three ContractingBlocks, and a final 1x1 Conv2D that outputs a 32x32 grid of patch predictions (logits). This structure is common to both discriminators A and B.
def Discriminator():
Input = layers.Input(shape=(256, 256, 3))
channel = 64
X = FeatureMapBlock(channel, Input, False) #
X = ContractingBlock(channel*1, X, False, 5, False) #128
X = ContractingBlock(channel*2, X, False, 5) #64
X = ContractingBlock(channel*4, X, False, 5) #32
X = layers.Conv2D(1, kernel_size = 1, padding = "same")(X) #32
model = Model(inputs = Input, outputs = X)
return model
disc_A = Discriminator()
disc_B = Discriminator()
disc_A.summary()
Model: "model_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 256, 256, 3)] 0
conv2d_20 (Conv2D) (None, 256, 256, 64) 9472
conv2d_21 (Conv2D) (None, 128, 128, 64) 102464
leaky_re_lu_14 (LeakyReLU) (None, 128, 128, 64) 0
conv2d_22 (Conv2D) (None, 64, 64, 128) 204928
instance_normalization_20 ( (None, 64, 64, 128) 256
InstanceNormalization)
leaky_re_lu_15 (LeakyReLU) (None, 64, 64, 128) 0
conv2d_23 (Conv2D) (None, 32, 32, 256) 819456
instance_normalization_21 ( (None, 32, 32, 256) 512
InstanceNormalization)
leaky_re_lu_16 (LeakyReLU) (None, 32, 32, 256) 0
conv2d_24 (Conv2D) (None, 32, 32, 1) 257
=================================================================
Total params: 1,137,345
Trainable params: 1,137,345
Non-trainable params: 0
_________________________________________________________________
tf.keras.utils.plot_model(disc_A)
def get_disc_loss(real_pred, fake_pred, adv_criterion):
ones_label = tf.ones_like(real_pred)
zeros_label = tf.zeros_like(fake_pred)
disc_loss = (adv_criterion(ones_label, real_pred) + adv_criterion(zeros_label, fake_pred))/2
return disc_loss
There are three types of generator losses. First, get_gen_adversarial_loss calculates the adversarial loss of a generator. It takes the discriminator's predictions on fake images; since the generator wants those fakes to be judged real, the true label is always 1, and the loss is then computed with adv_criterion.
get_cycle_consistency_loss compares the cycled image with the original one; typically MAE is selected as cycle_criterion. get_identity_loss uses a similar criterion: it measures the gap between the original image and the generated image. For example, if the original is a Monet, we need to make sure the Monet image does not change after going through the Landscape-to-Monet generator (gen_BA).
def get_gen_adversarial_loss(fake_pred, adv_criterion):
#label is always one
ones_label = tf.ones_like(fake_pred)
loss = adv_criterion(ones_label, fake_pred)
return loss
def get_cycle_consistency_loss(real_X, cycle_X, cycle_criterion):
cycle_loss = cycle_criterion(real_X, cycle_X)
return cycle_loss
def get_identity_loss(real_X, identity_X, identity_criterion):
identity_loss = identity_criterion(real_X, identity_X)
return identity_loss #, identity_X
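With raw logits and BCE (from_logits=True), a maximally unsure discriminator (all-zero logits, i.e. sigmoid 0.5 everywhere) yields a loss of ln 2 ≈ 0.693, a useful reference point when reading the loss curves later. A NumPy sketch of what get_disc_loss computes, using a hand-written numerically stable sigmoid cross-entropy as an illustrative stand-in for the Keras loss:

```python
import numpy as np

# Stable sigmoid cross-entropy on raw logits, averaged over all elements
# (a hand-written stand-in for BinaryCrossentropy(from_logits=True)).
def bce_with_logits(labels, logits):
    return np.mean(np.maximum(logits, 0) - logits * labels
                   + np.log1p(np.exp(-np.abs(logits))))

# Mirrors get_disc_loss: real patches labelled 1, fake patches labelled 0.
def disc_loss_np(real_pred, fake_pred):
    return (bce_with_logits(np.ones_like(real_pred), real_pred)
            + bce_with_logits(np.zeros_like(fake_pred), fake_pred)) / 2

# A maximally unsure discriminator on 32x32 patch outputs.
real_pred = np.zeros((1, 32, 32, 1))
fake_pred = np.zeros((1, 32, 32, 1))
print(disc_loss_np(real_pred, fake_pred))  # ln(2) ≈ 0.6931
```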
Optimizers
Training a CycleGAN means training four models: generators AB and BA, and discriminators A and B. Therefore, four optimizers are needed.
optimizer_disc_A = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
optimizer_disc_B = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
optimizer_gen_AB = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
optimizer_gen_BA = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
To monitor CycleGAN training progress, the losses are stored in the following lists.
loss_discA = []
loss_discB = []
loss_gen_AB = []
loss_gen_BA = []
loss_gen_adv_AB = []
loss_gen_adv_BA = []
loss_gen_cycle_ABA = []
loss_gen_cycle_BAB = []
loss_gen_iden_AB = []
loss_gen_iden_BA = []
Loss functions
For the adversarial loss, either BCE or MSE can be used. For the other criteria, MAE is the choice.
BCE_loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
MSE_loss = tf.keras.losses.MeanSquaredError()
MAE_loss = tf.keras.losses.MeanAbsoluteError()
adv_loss_function = BCE_loss #MSE_loss
train_CycleGan trains the discriminators and generators separately. To keep the balance between generator and discriminator, it skips training a discriminator when its loss is below a threshold. After training the discriminators, it trains the generators.
Train_OneEpoch calls train_CycleGan in a for loop. It reads a Monet image and a landscape image and feeds them into training. It also applies a horizontal flip to the Monet image as data augmentation, since there are only 300 Monet images. Every training step randomly selects a Monet and a landscape image, so the pairing of the two images changes on every training routine.
def train_CycleGan(real_A, real_B):
lambda_cycle = 10
lambda_identity = 2
lambda_adv = 1
# Threshold of training discriminator. It skips training discriminator when loss < threshold
disc_train_threshold = 0.35
with tf.GradientTape() as tape:
#fake
fake_A = gen_BA(real_B)
#pred by disc
real_A_pred = disc_A(real_A)
fake_A_pred = disc_A(fake_A)
#disc loss
disc_loss_A = get_disc_loss(real_A_pred, fake_A_pred, adv_loss_function)### MSE or BCE
loss_discA.append(disc_loss_A)
if disc_loss_A > disc_train_threshold:
gradients = tape.gradient(disc_loss_A, disc_A.trainable_weights)
optimizer_disc_A.apply_gradients(zip(gradients, disc_A.trainable_weights))
with tf.GradientTape() as tape:
#fake
fake_B = gen_AB(real_A)
#pred by disc
real_B_pred = disc_B(real_B)
fake_B_pred = disc_B(fake_B)
#disc loss
disc_loss_B = get_disc_loss(real_B_pred, fake_B_pred, adv_loss_function)### MSE or BCE
loss_discB.append(disc_loss_B)
if disc_loss_B > disc_train_threshold:
gradients = tape.gradient(disc_loss_B, disc_B.trainable_weights)
optimizer_disc_B.apply_gradients(zip(gradients, disc_B.trainable_weights))
with tf.GradientTape(persistent=True) as tape:
#fake
fake_A = gen_BA(real_B)
fake_B = gen_AB(real_A)
#cycle
cycle_A = gen_BA(fake_B)
cycle_B = gen_AB(fake_A)
#identity
iden_B = gen_AB(real_B)
iden_A = gen_BA(real_A)
#pred by disc
fake_B_pred = disc_B(fake_B)
fake_A_pred = disc_A(fake_A)
#gen loss
gen_adversarial_loss_AB = get_gen_adversarial_loss(fake_B_pred, adv_loss_function)### MSE or BCE
gen_adversarial_loss_BA = get_gen_adversarial_loss(fake_A_pred, adv_loss_function)### MSE or BCE
loss_gen_adv_AB.append(gen_adversarial_loss_AB)
loss_gen_adv_BA.append(gen_adversarial_loss_BA)
#cycle loss
cycle_loss_A = get_cycle_consistency_loss(real_A, cycle_A, MAE_loss)
cycle_loss_B = get_cycle_consistency_loss(real_B, cycle_B, MAE_loss)
total_cycle_loss = cycle_loss_A + cycle_loss_B
loss_gen_cycle_ABA.append(cycle_loss_A)
loss_gen_cycle_BAB.append(cycle_loss_B)
#identity loss
identity_loss_AB = get_identity_loss(real_B, iden_B, MAE_loss)
identity_loss_BA = get_identity_loss(real_A, iden_A, MAE_loss)
loss_gen_iden_AB.append(identity_loss_AB)
loss_gen_iden_BA.append(identity_loss_BA)
#gen loss
#gen_loss = gen_adversarial_loss*lambda_adv + cycle_loss*lambda_cycle + identity_loss*lambda_identity
gen_loss_AB = gen_adversarial_loss_AB*lambda_adv + total_cycle_loss*lambda_cycle + identity_loss_AB*lambda_identity
gen_loss_BA = gen_adversarial_loss_BA*lambda_adv + total_cycle_loss*lambda_cycle + identity_loss_BA*lambda_identity
loss_gen_AB.append(gen_loss_AB)
loss_gen_BA.append(gen_loss_BA)
gradients = tape.gradient(gen_loss_AB, gen_AB.trainable_weights)
optimizer_gen_AB.apply_gradients(zip(gradients, gen_AB.trainable_weights))
gradients = tape.gradient(gen_loss_BA, gen_BA.trainable_weights)
optimizer_gen_BA.apply_gradients(zip(gradients, gen_BA.trainable_weights))
del tape
gc.collect()
def read_img(dir_name, file_name, convert = False):
img = cv2.imread(dir_name + file_name)[:,:,::-1]
if convert:
img = img/(255/2) - 1
return img
def Train_OneEpoch(EPOCH):
time1 = time.time()
print("EPOCH ", EPOCH)
count = 0
np.random.seed(EPOCH)
m_idx = np.arange(n1)
np.random.shuffle(m_idx)
f_idx = np.arange(len(files_photo))
np.random.shuffle(f_idx)
for i in range(n1):
F_data = read_img(dir_photo, files_photo[f_idx[i]], convert = True)
M_data2 = M_data[m_idx[i]]
#Horizontal Flip of Monet image
if np.random.rand(1)[0] > 0.5:
train_CycleGan(M_data2[:,::-1,:].reshape(1,256,256,3), F_data.reshape(1,256,256,3))
else:
train_CycleGan(M_data2.reshape(1,256,256,3), F_data.reshape(1,256,256,3))
if (i+1) % 100 == 0:
print(i+1)
time2 = time.time()
time3 = np.round(time2 - time1)
print("time ", time3, "sec")
check_output visualizes training progress after each epoch. It shows the original landscape image, the converted image, the cycled image, and the identity image.
def check_output(EPOCH, plot_line, plot_all = False):
real_A = M_data[0:36]
real_B = F_data[0:36]
fake_B = gen_AB(real_A)
fake_A = gen_BA(real_B)
cycle_A = gen_BA(fake_B)
identity_A = gen_BA(real_A)
cycle_B = gen_AB(fake_A)
identity_B = gen_AB(real_B)
tag = "_" + str(EPOCH)
#np.save("real_A", convert_img(real_A))
np.save("real_B", convert_img(real_B))
#np.save("fake_B" +tag, convert_img(fake_B))
#np.save("cycle_A" +tag, convert_img(cycle_A))
#np.save("identity_A" +tag, convert_img(identity_A))
np.save("fake_A" +tag, convert_img(fake_A))
np.save("cycle_B" +tag, convert_img(cycle_B))
np.save("identity_B" +tag, convert_img(identity_B))
tag = ", epoch " + str(EPOCH)
if plot_all:
#plot_photo(fake_B)
plot_photo(fake_A)
if plot_line:
#plot_OneLine(real_A, "real_A")
#plot_OneLine(fake_B, "fake_B " + tag)
#plot_OneLine(cycle_A, "cycle_A " + tag)
#plot_OneLine(identity_A, "identity_A " + tag)
plot_OneLine(real_B, "real_B")
plot_OneLine(fake_A, "fake_A " + tag)
plot_OneLine(cycle_B, "cycle_B " + tag)
plot_OneLine(identity_B, "identity_B " + tag)
def Train_Loop(EPOCH, EPOCH_prev):
plot_line = True
for i in range(EPOCH):
Train_OneEpoch(i + EPOCH_prev)
plot_line = False
if (i + EPOCH_prev) % 1 == 0:
plot_line = True
check_output(i + EPOCH_prev, plot_line, False)
The following hyperparameters were tuned in previous versions of this notebook, guided by Coursera's GAN Specialization [1] and the CycleGAN paper [2].
I selected LeakyReLU for all activation functions in both the generator and the discriminator. The GAN Specialization [1] uses ReLU for the generator's ContractingBlock and ResidualBlock, but ReLU turned dark parts of the generated images completely black (blackout) because it cuts off small values. In contrast, LeakyReLU preserved information in dark zones, and the generated images had no blackouts.
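The cut-off can be seen directly (a tiny illustration with made-up pre-activation values; the 0.2 slope matches the LeakyReLU alpha used in the blocks above):

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])  # hypothetical pre-activations

relu = np.maximum(x, 0.0)            # all negatives collapse to 0 ("blackout")
leaky = np.where(x > 0, x, 0.2 * x)  # negatives keep a scaled-down signal

print(relu)   # negative inputs are gone entirely
print(leaky)  # negative inputs survive as -0.4 and -0.1
```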
With 2 ContractingBlocks, training was stable and fast. Adding more ContractingBlocks did not improve the result; it ended up with blurred images after more training epochs.
The number of ResidualBlocks is 3 in the final model. Even with more than 3 ResidualBlocks, the generated images looked similar.
Kernel size = 5 was more stable than the kernel size of 4 used in the GAN Specialization.
The learning rate is set to 0.0002 as per the CycleGAN paper [2]. Increasing the learning rate did not improve the result.
The GAN Specialization [1] recommends MSE for the adversarial loss to avoid vanishing gradients. However, BCE worked slightly better than MSE for the GAN architecture of this project, so BCE is selected for the adversarial loss. Furthermore, BCE makes it easier to monitor the occurrence of overfitting.
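One reason BCE can still work well here: since the discriminator outputs raw logits and BCE is used with from_logits=True, its gradient with respect to a logit is bounded but never collapses to zero, even on confidently rejected fakes. A hand-derived numerical sketch with hypothetical logits (the MSE gradient below assumes MSE applied directly to the logit, matching how MSE_loss would be wired in this notebook):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# d/dz of BCE-with-logits, target label 1: sigmoid(z) - 1.
# Bounded in (-1, 0): never vanishes, never explodes.
def bce_grad(z):
    return sigmoid(z) - 1.0

# d/dz of MSE on the raw logit, target 1: 2 * (z - 1).
# Grows without bound as the discriminator gets more confident.
def mse_grad(z):
    return 2.0 * (z - 1.0)

# Generator's view when D confidently rejects its fakes (large negative logit).
for z in [-1.0, -4.0, -8.0]:
    print(z, bce_grad(z), mse_grad(z))
```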
During training, the CycleGAN is validated visually by plotting the fake, cycled, and identity images and inspecting image structure, color, texture, etc. Furthermore, the training losses are plotted to make sure mode collapse (overfitting) does not occur.
N_EPOCH = 4
Train_Loop(N_EPOCH, 0)
Train_Loop(N_EPOCH, 4)
Train_Loop(N_EPOCH, 8)
EPOCH  8
100
200
300
time  478.0 sec
EPOCH  9
100
200
300
time  477.0 sec
EPOCH  10
100
200
300
time  493.0 sec
EPOCH  11
100
200
300
time  487.0 sec
fig, ax = plt.subplots(3, figsize = (10, 12))
#cut off the first 40 train loss
start = 0
ax[0].plot(loss_discA[start:], label ="disc_A", alpha = 0.7, lw = 0.6)
ax[0].plot(loss_discB[start:], label = "disc_B", alpha = 0.7, lw = 0.6)
ax[0].plot(loss_gen_adv_AB[start:], label = "gen_AB", alpha = 0.7, lw = 0.6)
ax[0].plot(loss_gen_adv_BA[start:], label = "gen_BA", alpha = 0.7, lw = 0.6)
ax[0].set_title("Adversarial Loss")
ax[0].set_ylim(0,2)
#ax[1].plot(np.array(loss_gen_adv), label = "adv", alpha = 0.8)
ax[1].plot(loss_gen_cycle_ABA[start:], label ="cycle_ABA", alpha = 0.7, lw = 0.6)
ax[1].plot(loss_gen_cycle_BAB[start:], label ="cycle_BAB", alpha = 0.7, lw = 0.6)
ax[1].plot(loss_gen_iden_AB[start:], label = "identity_AB", alpha = 0.7, lw = 0.6)
ax[1].plot(loss_gen_iden_BA[start:], label = "identity_BA", alpha = 0.7, lw = 0.6)
ax[1].set_title("Generator Loss")
ax[2].plot(loss_gen_AB[start:], label = "gen_AB", alpha = 0.7, lw = 0.6)
ax[2].plot(loss_gen_BA[start:], label = "gen_BA", alpha = 0.7, lw = 0.6)
ax[2].set_title("Generator Loss (total)")
for i in range(3):
ax[i].grid()
ax[i].legend()
36 samples of the final output images are shown below. Their colors differ from the original images.
check_output(N_EPOCH, False, True)
This model got FID 74.5 in version 24 of this notebook, achieving the project target of FID < 100. The final output has faded colors that differ from the originals, and the texture looks slightly rougher than the originals, yet does not quite look like Monet. However, I could confirm that the CycleGAN worked as it should.
Other takeaways
To implement CycleGAN, I studied the GAN Specialization [1]. The problem was that all assignments are in PyTorch, although I prefer coding in TensorFlow. Therefore, I had to re-code everything in TensorFlow, which was a good experience to verify my understanding.
[1] Apply Generative Adversarial Networks (GANs), DeepLearning.AI
[2] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Zhu, Park, Isola, and Efros, 2020)
import PIL
! mkdir ../images
def submit_files():
n3 = len(files_photo)
for i in range(n3):
real_F = cv2.imread(dir_photo + files_photo[i])[:,:,::-1].reshape(1,256,256,3)
real_F = real_F/(255/2) - 1
fake_A = np.array(gen_BA(real_F))[0]
prediction = convert_img(fake_A).astype(np.uint8)
im = PIL.Image.fromarray(prediction, mode="RGB")
im.save("../images/" + str(i) + ".jpg")
submit_files()
import shutil
shutil.make_archive("/kaggle/working/images", 'zip', "/kaggle/images")
'/kaggle/working/images.zip'